Skip to content

Adding GetILForModule cDAC API #118546

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Adding GetILForModule cDAC API #118546

wants to merge 6 commits into from

Conversation

rcj1
Copy link
Contributor

@rcj1 rcj1 commented Aug 8, 2025

A couple comments on this one:

  • I am assuming that IsMapped() is always true, excluding ReflectionEmit methods.
  • I am also omitting the checks here; they don't seem important for the read-only dumping of IL and they would add a decent amount of complexity to the descriptor and contracts.
    • The first check validates various alignment and flags; to me they look superfluous for read-only operation.
    • The second check ensures that the RVA is valid within the PE file. We omit a check for the IL size in the existing DAC code; it seems we have accepted the risk of this leading to failure on a ReadVirtual if the given size is incorrect/corrupted.

@Copilot Copilot AI review requested due to automatic review settings August 8, 2025 18:37
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements the GetILForModule cDAC API, which retrieves the address of IL (Intermediate Language) code for a given module and relative virtual address (RVA). The implementation replaces a placeholder that delegated to legacy code with a full cDAC-based solution.

Key changes:

  • Adds complete implementation of GetILForModule in SOSDacImpl with proper parameter validation and error handling
  • Introduces new GetILAddr method in the ILoader contract to calculate IL addresses from PE assembly data
  • Includes DEBUG-only verification against legacy implementation to ensure compatibility

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
SOSDacImpl.cs Replaces placeholder with full GetILForModule implementation including validation, error handling, and debug verification
Loader_1.cs Implements GetILAddr method to calculate IL addresses by traversing PE assembly, image, and layout structures
ILoader.cs Adds GetILAddr method signature to the ILoader contract interface

@rcj1 rcj1 requested a review from noahfalk August 8, 2025 18:37
Copy link
Contributor

Tagging subscribers to this area: @steveisok, @dotnet/dotnet-diag
See info in area-owners.md if you want to be subscribed.

@rcj1 rcj1 requested review from max-charlamb and removed request for noahfalk and max-charlamb August 8, 2025 18:51
@rcj1 rcj1 marked this pull request as draft August 8, 2025 18:52
@rcj1 rcj1 marked this pull request as ready for review August 8, 2025 18:57
@rcj1 rcj1 requested review from noahfalk and max-charlamb August 8, 2025 18:57
@jkotas
Copy link
Member

jkotas commented Aug 8, 2025

I am assuming that IsMapped() is always true, excluding ReflectionEmit methods.

That's only true on Windows for files loaded from disk.

IsMapped is typically false on non-Windows for non-R2R binaries. Also, it is likely false for binaries loaded from memory (e.g. using Assembly.Load(byte[]) API) on all platforms.

| `ImageFileHeader` | `SizeOfOptionalHeader` | Size of optional header on disk |
| `ImageSectionHeader` | `VirtualAddress` | Virtual address of the section |
| `ImageSectionHeader` | `VirtualSize` | Virtual size of the section |
| `ImageSectionHeader` | `PointerToRawData` | Offset to section data |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the offsets in these ImageXYZ structures are defined by the PE format specification. Instead of treating them as runtime types that might change over time and need to be described in each build we can consider them immutable constants which any reader is free to hard-code. This should let us remove them from the contract surface area.

Copy link
Contributor Author

@rcj1 rcj1 Aug 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting defining these as global constants in the datadescriptor rather than as type fields, and/or hardcoding from the format specification as opposed to using offsetof()? What do you mean by “reader”?

How do you see the “constants” approach improving upon the current state? In the current state, it is easy to see which offsets belong to which data types, with the type serving as an organizational structure. Perhaps we could have a section of the global dedicated to the IMAGE_* offsets.

@@ -196,6 +200,76 @@ bool ILoader.TryGetLoadedImageContents(ModuleHandle handle, out TargetPointer ba
return true;
}

private static uint AlignUp(uint value, uint alignment)
{
if (alignment == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a stronger check here?

Suggested change
if (alignment == 0)
if (!BitOperations.IsPow2(alignment))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not realize I had that check, Copilot must have snuck it in somehow… The original code does not have this, so I think we can do without it.

while (section < sectionEnd)
{
Data.ImageSectionHeader sectionHeader = _target.ProcessedData.GetOrAdd<Data.ImageSectionHeader>(section);
if (rva < sectionHeader.VirtualAddress + AlignUp(sectionHeader.VirtualSize, sectionAlignment))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this is mirrors PEDecoder::RvaToSection, but it does not look right.

For images that are not mapped, we can map RVA to physical data only if the RVA is between section->VirtualAddress and section->VirtualAddress + section->SizeOfRawData. If the RVA is outside this range, there is no good mapping. This situation should only ever happen in corrupted images that we do not see in practice.

As written, this logic is going to produce bogus result that will lead to harder to diagnose crash later. I am wondering whether it is something we want to fix.

TargetPointer ntHeadersPtr = FindNTHeaders(imageLayout);
Data.ImageNTHeaders ntHeaders = _target.ProcessedData.GetOrAdd<Data.ImageNTHeaders>(ntHeadersPtr);
Target.TypeInfo type = _target.GetTypeInfo(DataType.ImageNTHeaders);
int offset = type.Fields[nameof(Data.ImageNTHeaders.OptionalHeader)].Offset;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regular runtime data structures are in platform native format. These Image... data structures are always in little endian. Do we have any prior art in the cDAC for dealing with endianess?

This is not actual problem at the moment since all architectures supported by CoreCLR are little endian currently. cDAC should be designed to bridge endianess differences for future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current target Read method reads in target endianness; it would be easy to allow the caller to optionally specify the endianness.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants